AIML Capstone Project - CV- Car Detection

DOMAIN: Automotive Surveillance

• CONTEXT:

Computer vision can be used to automate supervision and generate an appropriate action trigger when an event of interest is predicted from an image. For example, a car moving on the road can be identified by a camera by its make, type, colour, number plate, etc.

• DATA DESCRIPTION:

The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where each class has been split roughly 50-50. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.

Data description:

‣ Train Images: Real images of cars as per the make and year of the car.

‣ Test Images: Real images of cars as per the make and year of the car.

‣ Train Annotation: Bounding box regions for the training images.

‣ Test Annotation: Bounding box regions for the testing images.

Dataset has been attached along with this project. Please use the same for this capstone project.

Dataset: https://drive.google.com/drive/folders/1y6JWx2CpsOuka00uePe72jNgr7F9sK45?usp=sharing

Original dataset link for your reference only: https://www.kaggle.com/jutrera/stanford-car-dataset-by-classes-folder

Reference: 3D Object Representations for Fine-Grained Categorization, Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. 4th IEEE Workshop on 3D Representation and Recognition (3dRR-13), at ICCV 2013. Sydney, Australia. Dec. 8, 2013.

• PROJECT OBJECTIVE: Design a DL based car identification model.

PROJECT TASK: [ Duration: 6 weeks, Score: 100 points]

1. Milestone 1: [ Duration: 2 weeks, Score: 20 points]

‣ Process:

‣ Step 1: Import the data

‣ Step 2: Map training and testing images to their classes.

‣ Step 3: Map training and testing images to their annotations.

‣ Step 4: Display images with bounding boxes

‣ Output: Images mapped to their classes and annotations, ready to be used for deep learning
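The class-mapping step above can be sketched in pandas. The miniature DataFrame below only mimics the real annotation layout (file names, 1-based class indices, and a list of class names are stand-ins, not the actual dataset files):

```python
import pandas as pd

# Hypothetical miniature of the Stanford Cars annotation layout:
# one row per image with a 1-based class index, plus the class-name list.
class_names = ["AM General Hummer SUV 2000", "BMW M3 Coupe 2012"]  # 196 in the real set
train_anno = pd.DataFrame({
    "image": ["00001.jpg", "00002.jpg"],
    "class": [1, 2],  # 1-based index into class_names
})

# Map each image to its human-readable class label.
train_anno["class_name"] = train_anno["class"].map(lambda i: class_names[i - 1])
print(train_anno)
```

The same mapping applies unchanged to the test annotations, since both share the 196-class list.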

Visualizing Training images

Total classes in Training Data set

We can see that the training set has 8,144 images across 196 classes. Let us create the testing data set now.

We can see that the testing set has 8,041 images.

We will sort the test and train images by image name.

Let us load the annotations now.

We can see that the training annotations have bounding box coordinates for all 8,144 images.

We can see that the testing annotations have bounding box coordinates for all 8,041 images.

We can see that the annotations dataset has unnamed columns for the bounding box coordinates. Let us assume they are in the sequence x1, x2, y1, y2 and visualize one image to verify it.

Mapping train and test images to their annotations and classes now.

We can see that the train images are mapped to their classes and annotations.

We will rename the unnamed columns to X1, Y1, X2, Y2 accordingly.

Now let us visualize the train images with their bounding boxes.
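A minimal sketch of drawing one bounding box with matplotlib follows. The black placeholder image and the (x1, y1, x2, y2) values are assumptions; in the notebook the image comes from the train folder and the coordinates from the renamed annotation columns:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend: save to file instead of showing
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# Hypothetical image and box, standing in for a real train image + annotation.
img = np.zeros((200, 300, 3), dtype=np.uint8)
x1, y1, x2, y2 = 50, 40, 220, 160

fig, ax = plt.subplots()
ax.imshow(img)
# Rectangle takes the top-left corner plus width and height.
ax.add_patch(patches.Rectangle((x1, y1), x2 - x1, y2 - y1,
                               linewidth=2, edgecolor="red", fill=False))
fig.savefig("bbox_preview.png")
plt.close(fig)
```

If the red box hugs the car, the assumed coordinate order from the annotation columns is correct.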

We can see that the train images are correctly mapped to their classes and annotations. Let us visualize 5 test images with their bounding boxes.

EDA and Data Visualizations

Importing Car names.

Here we have the desired output, i.e. images mapped to their classes and annotations, ready to be used for deep learning. We have also visualized train and test images with bounding boxes, and created the train, validation and test data sets for building a CNN model for car classification.
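The validation hold-out mentioned above can be sketched with a shuffled index split. The file names and the 80/20 ratio below are placeholders (the real set has 8,144 training images; the actual split ratio used in the notebook is not stated here):

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in for the 8,144 training file names.
files = np.array([f"{i:05d}.jpg" for i in range(1, 101)])

# Hold out 20% of the training images as a validation set.
idx = rng.permutation(len(files))
n_val = int(0.2 * len(files))
val_files, train_files = files[idx[:n_val]], files[idx[n_val:]]
print(len(train_files), len(val_files))  # 80 20
```

Shuffling before slicing keeps both subsets representative, since the files are sorted by name and hence grouped by class.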

Milestone 2: [ Duration: 2 weeks, Score: 20 points]

‣ Input: Output of milestone 1

‣ Process:

‣ Step 1: Design, train and test CNN models to classify the car.

‣ Step 2: Design, train and test R-CNN and its hybrid object detection models to impose a bounding box or mask over the area of interest.

‣ Output: Pickled model to be used for future prediction

‣ Submission: Interim report
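The "pickled model" output can be sketched as a pickle round-trip. The dictionary below is a stand-in for the trained artifact (Keras models themselves are usually saved with `model.save()` rather than raw pickle, which does not always handle them well):

```python
import pickle

# Stand-in for the trained model artifact produced in this milestone.
model_artifacts = {"classes": 196, "weights_path": "cnn_weights.h5"}

# Serialize to disk for future prediction runs.
with open("car_model.pkl", "wb") as f:
    pickle.dump(model_artifacts, f)

# Restore it exactly as saved.
with open("car_model.pkl", "rb") as f:
    restored = pickle.load(f)
```

The restored object is equal to the saved one, which is all a downstream prediction script needs.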

Let us check the train and test CSV data.

CNN Model using Transfer Learning

We are using the built-in Inception-ResNet V2 module in tf.keras. We exclude the top, so its output layer is not imported; instead we add our own output layer with 196 classes for classification.
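A minimal sketch of that architecture is below. `weights=None` keeps the sketch offline (the notebook would use the `'imagenet'` weights for actual transfer learning), and the input size and optimizer are assumptions:

```python
import tensorflow as tf

NUM_CLASSES = 196

# Inception-ResNet V2 backbone without its ImageNet classification head.
base = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights=None, input_shape=(299, 299, 3))
base.trainable = False  # freeze the backbone for transfer learning

# Add our own pooling + 196-way softmax output layer.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the backbone means only the new head is trained at first; the base can later be unfrozen for fine-tuning at a lower learning rate.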

Here we can see that the predicted image class is 108, but the actual class is 181. Let us try some more test images.

We can see that only the rows above were predicted correctly. This CNN model is not performing accurately. We will try another model now.
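The correctness check above amounts to comparing the argmax of each softmax row against the true label. A tiny sketch with made-up probabilities (4 images, 5 classes instead of 196) illustrates it:

```python
import numpy as np

# Hypothetical softmax outputs for 4 test images over 5 classes.
probs = np.array([
    [0.10, 0.70, 0.10, 0.05, 0.05],
    [0.20, 0.20, 0.50, 0.05, 0.05],
    [0.60, 0.10, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.15, 0.10],
])
y_true = np.array([1, 2, 3, 0])

y_pred = probs.argmax(axis=1)          # predicted class per image
accuracy = (y_pred == y_true).mean()   # fraction of rows predicted correctly
print(accuracy)  # 0.75 here: 3 of 4 correct
```

On the real test set the same comparison is done row by row over the prediction table.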

Object Localization with ResNet as base

Build Model

Evaluation Metric IOU
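IoU (intersection over union) scores a predicted box against the ground-truth box: the overlap area divided by the combined area. A self-contained sketch, assuming (x1, y1, x2, y2) corner format as in the annotations:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero: non-overlapping boxes have no intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```

A perfect match scores 1.0, disjoint boxes score 0.0; detection metrics such as mAP@.50IOU count a prediction correct when its IoU exceeds the threshold.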

TensorFlow Object Detection API

The TensorFlow object detection API requires data in TFRecord format. This can be done using the generate_tfrecord.py file.

The script requires 3 inputs:

--csv_input= : location of the CSV file prepared in the previous step

--img_path= : where the actual images are stored

--output_path= : where the script should save the generated TFRecord file, and what its file name should be.

We will run the script for the training and test CSVs separately to create two TFRecord files.

TensorFlow model zoo: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md

Prepare Training configuration file

Change num_classes parameter to 196

For 'train_input_reader' change 'input_path' to filepath of train.record file.

For 'train_input_reader' change 'label_map_path' to filepath of label_map.txt file.

Repeat above two steps for 'eval_input_reader'.

Change 'fine_tune_checkpoint' to the filepath of the downloaded pre-trained checkpoint.

Change 'batch_size' accordingly to available memory.

Change 'num_steps' to indicate how long the training will run, e.g. 200000.

Sample config files are available at models/research/object_detection/samples/configs.
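Putting the edits above together, the relevant parts of a pipeline.config might look like the fragment below. All paths are placeholders, and the batch size is only an example to be tuned to available memory:

```
model {
  ssd {
    num_classes: 196
  }
}
train_config {
  batch_size: 8            # reduce if GPU memory is limited
  num_steps: 200000
  fine_tune_checkpoint: "pre-trained-model/checkpoint/ckpt-0"
}
train_input_reader {
  label_map_path: "annotations/label_map.txt"
  tf_record_input_reader { input_path: "annotations/train.record" }
}
eval_input_reader {
  label_map_path: "annotations/label_map.txt"
  tf_record_input_reader { input_path: "annotations/test.record" }
}
```

Note that 'train_input_reader' and 'eval_input_reader' point at the two TFRecord files generated earlier but share the same label map.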

Visualize model output

The total_loss for the trained model settled at 1.471968.

This can be improved by adjusting hyperparameters and training for more epochs.

We modified the base learning rate and warmup learning rate in the config file and proceeded with model training.

TUNING MODEL

Below are the best results after 190,000 training steps.

  Detection Boxes_Precision/mAP: 0.821089

  Detection Boxes_Precision/mAP@.50IOU: 0.907789

  Detection Boxes_Precision/mAP@.75IOU: 0.900015

  Loss/localization_loss: 0.017242 

  Loss/classification_loss: 0.212154 

  Loss/regularization_loss: 0.123719 

  Loss/total_loss: 0.353114

The model trained on the custom cars dataset using SSD MobileNet achieved a mAP of 90% at 0.75 IoU.

FLASK MODEL

The exported model from TF2 is saved at: models/detection_model/saved_model
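A minimal sketch of the Flask app's two routes is below. The route names, the plain-string home response, and the echo-only prediction body are placeholders; the real app renders the HTML templates and runs the exported SavedModel on the uploaded file:

```python
from flask import Flask, request

app = Flask(__name__)
UPLOAD_FOLDER = "uploads"  # must exist, per the requirement above

@app.route("/")
def home():
    # The real app renders templates/home.html; a string keeps the sketch small.
    return "AutoVision home"

@app.route("/predict", methods=["POST"])
def predict():
    # The real app saves request.files["file"] into UPLOAD_FOLDER and runs
    # the exported SavedModel on it; here we only echo the file name back.
    f = request.files["file"]
    return f"received {f.filename}"
```

Flask's built-in test client can exercise both routes without starting a server, which is handy before wiring in the detection model.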

We have 2 pages for the web application, Login and Home, along with prediction.

Requirement: create a folder named uploads to store the uploaded images for prediction.

The static folder contains images and CSS files, located at https://github.com/girijeshcse/autovision/tree/main/static

HTML pages are located at: https://github.com/girijeshcse/autovision/tree/main/templates

The home page is shown below, with team information and a "Choose File" option to select an image for prediction.

image3.JPG

Once the image is uploaded, a preview can be seen as below.

image1.JPG

Upon clicking the "Identify Car" button, the classified images are shown with bounding boxes, as below.

image2.JPG